Cross-Modal Retrieval: A Pairwise Classification Approach
نویسندگان
چکیده
Content is increasingly available in multiple modalities (such as images, text, and video), each of which provides a different representation of some entity. The cross-modal retrieval problem is: given the representation of an entity in one modality, find its best representation in all other modalities. We propose a novel approach to this problem based on pairwise classification. The approach seamlessly applies to both the settings where ground-truth annotations for the entities are absent and present. In the latter case, the approach considers both positive and unlabelled links that arise in standard cross-modal retrieval datasets. Empirical comparisons show improvements over state-of-theart methods for cross-modal retrieval.
منابع مشابه
Cross-Modal Learning via Pairwise Constraints
In multimedia applications, the text and image components in a web document form a pairwise constraint that potentially indicates the same semantic concept. This paper studies cross-modal learning via the pairwise constraint, and aims to find the common structure hidden in different modalities. We first propose a compound regularization framework to deal with the pairwise constraint, which can ...
متن کاملPairwise Relationship Guided Deep Hashing for Cross-Modal Retrieval
With benefits of low storage cost and fast query speed, crossmodal hashing has received considerable attention recently. However, almost all existing methods on cross-modal hashing cannot obtain powerful hash codes due to directly utilizing hand-crafted features or ignoring heterogeneous correlations across different modalities, which will greatly degrade the retrieval performance. In this pape...
متن کاملUnsupervised Generative Adversarial Cross-modal Hashing
Cross-modal hashing aims to map heterogeneous multimedia data into a common Hamming space, which can realize fast and flexible retrieval across different modalities. Unsupervised cross-modal hashing is more flexible and applicable than supervised methods, since no intensive labeling work is involved. However, existing unsupervised methods learn hashing functions by preserving inter and intra co...
متن کاملImage Retrieval: Content versus Context
In this paper, we introduce a new approach to image retrieval. This new approach takes the best from two worlds, combines image features (content) and words from collateral text (context) into one semantic space. Our approach uses Latent Semantic Indexing, a method that uses co-occurrence statistics to uncover hidden semantics. This paper shows how this method, that has proven successful in bot...
متن کاملCross-Modal Similarity Learning via Pairs, Preferences, and Active Supervision
We present a probabilistic framework for learning pairwise similarities between objects belonging to different modalities, such as drugs and proteins, or text and images. Our framework is based on learning a binary code based representation for objects in each modality, and has the following key properties: (i) it can leverage both pairwise as well as easy-to-obtain relative preference based cr...
متن کامل